All contributors have signed the DCO ✍️ ✅
I have read the DCO document and I hereby sign the DCO.
The create-spike skill was splitting findings between the issue body and a follow-up comment. This made spike results harder to read and review. Merge the technical investigation section into the issue body template and remove the comment-posting step entirely.
Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
Closes #32
Add post-condition checks in drop_privileges() to verify that setgid() and setuid() actually changed the effective IDs. Also verify that setuid(0) fails after dropping privileges, confirming root cannot be re-acquired. This is a defense-in-depth hardening measure per CWE-250 and CERT POS37-C.
All added syscalls (geteuid, getegid, setuid) are async-signal-safe, so they are safe in the pre_exec context.
Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
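The post-condition checks described in this commit can be sketched as a small decision function. This is only an illustration, not the code from the PR: `DropCheck` and `check_drop` are hypothetical names, and the values that would come from getegid(), geteuid(), and a follow-up setuid(0) call are passed in as plain arguments so the logic can be read without any syscalls. It assumes the target UID is non-zero.

```rust
/// Outcome of the post-condition checks after a privilege drop.
/// (Hypothetical type, introduced for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub enum DropCheck {
    Ok,
    GidNotChanged { expected: u32, actual: u32 },
    UidNotChanged { expected: u32, actual: u32 },
    RootReacquired,
}

/// `egid` and `euid` stand for the values read back via getegid() and
/// geteuid() after the drop. `setuid_zero_ret` stands for the return value
/// of a follow-up setuid(0) call (0 = success, -1 = failure).
/// Assumes `want_uid` is not 0.
pub fn check_drop(
    want_gid: u32,
    want_uid: u32,
    egid: u32,
    euid: u32,
    setuid_zero_ret: i32,
) -> DropCheck {
    // The effective GID must match what setgid() was asked to set.
    if egid != want_gid {
        return DropCheck::GidNotChanged { expected: want_gid, actual: egid };
    }
    // The effective UID must match what setuid() was asked to set.
    if euid != want_uid {
        return DropCheck::UidNotChanged { expected: want_uid, actual: euid };
    }
    // setuid(0) must fail once privileges are dropped; a success here
    // means root can still be re-acquired.
    if setuid_zero_ret == 0 {
        return DropCheck::RootReacquired;
    }
    DropCheck::Ok
}
```

In a real pre_exec hook, any result other than `DropCheck::Ok` would presumably abort the child before exec.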
Closes #130
Rename the TUI from "Gator" to use generic naming:
- CLI subcommand: `nemoclaw gator` → `nemoclaw term`
- TUI title bar: "Gator" → "NemoClaw"
- Docs/skills: "Gator" → "the TUI" or "NemoClaw TUI"
- File renames: gator.toml → term.toml, gator.md → tui.md
- Remove dead link to nonexistent plans/gator-tui.md
Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
…135)
* feat(policy): add validation layer to reject unsafe sandbox policies
  Add policy validation that checks for root process identity, path traversal sequences, overly broad filesystem paths, and filesystem rule counts that exceed the limit. Validation runs at three entry points: disk-loaded YAML policies (fallback to restrictive default on violation), gRPC CreateSandbox, and gRPC UpdateSandboxPolicy (both gRPC paths return INVALID_ARGUMENT). Filesystem paths are normalized before storage to collapse traversal components.
  Closes #33
* fix(e2e): correct policy update test to match immutable field behavior
  The update policy test was asserting on validation errors for fields (process, filesystem) that are immutable on live sandboxes. The server rejects changes to these fields before validation runs. Updated the test to verify the immutability guard instead.
---------
Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
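A minimal sketch of the four checks named in the feat(policy) commit, under stated assumptions: the names `PolicyViolation`, `normalize_path`, `validate_policy`, and `MAX_FILESYSTEM_RULES` are hypothetical, the limit of 256 is invented for illustration, only `/` is treated as "overly broad", and paths are assumed to be absolute. The actual validation layer may differ on all of these points.

```rust
/// Hypothetical limit; the real value is not given in the commit message.
const MAX_FILESYSTEM_RULES: usize = 256;

/// Reasons a policy is rejected (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum PolicyViolation {
    RootIdentity,
    PathTraversal(String),
    OverlyBroadPath(String),
    TooManyRules(usize),
}

/// Collapse `.`, `..`, and empty components lexically, without touching
/// the filesystem. Assumes an absolute path.
pub fn normalize_path(path: &str) -> String {
    let mut parts: Vec<&str> = Vec::new();
    for comp in path.split('/') {
        match comp {
            "" | "." => {}
            ".." => {
                parts.pop();
            }
            other => parts.push(other),
        }
    }
    format!("/{}", parts.join("/"))
}

/// Run the checks in order and return the first violation found.
pub fn validate_policy(run_as_user: &str, paths: &[&str]) -> Result<(), PolicyViolation> {
    // Assumption: root is identified by the name "root" or the numeric ID "0".
    if run_as_user == "root" || run_as_user == "0" {
        return Err(PolicyViolation::RootIdentity);
    }
    if paths.len() > MAX_FILESYSTEM_RULES {
        return Err(PolicyViolation::TooManyRules(paths.len()));
    }
    for path in paths {
        // Reject any `..` component outright.
        if path.split('/').any(|c| c == "..") {
            return Err(PolicyViolation::PathTraversal((*path).to_string()));
        }
        // Assumption: only the filesystem root counts as "overly broad".
        if normalize_path(path) == "/" {
            return Err(PolicyViolation::OverlyBroadPath((*path).to_string()));
        }
    }
    Ok(())
}
```

Per the commit message, a caller loading a policy from disk would fall back to a restrictive default on `Err`, while the gRPC handlers would map it to INVALID_ARGUMENT.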
Closes #27
Add remove() methods to TracingLogBus, PlatformEventBus, and SandboxWatchBus to clean up entries when sandboxes are deleted. Wire cleanup into both handle_deleted (K8s reconciler) and delete_sandbox (gRPC handler). Reorder watch_sandbox to validate sandbox existence before subscribing to buses, preventing entries for non-existent IDs. Add one-time sandbox validation at stream open in push_sandbox_logs.
Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
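The cleanup pattern in this commit can be illustrated with a toy registry keyed by sandbox ID. `SandboxBus` below is a hypothetical stand-in; the real TracingLogBus, PlatformEventBus, and SandboxWatchBus types are not shown in the commit message and very likely hold channels rather than plain vectors.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a per-sandbox bus.
#[derive(Default)]
pub struct SandboxBus {
    channels: HashMap<String, Vec<String>>,
}

impl SandboxBus {
    /// Buffer a message for a sandbox, creating its entry on first use.
    /// This implicit creation is why entries leak if nothing removes them.
    pub fn publish(&mut self, sandbox_id: &str, msg: &str) {
        self.channels
            .entry(sandbox_id.to_string())
            .or_default()
            .push(msg.to_string());
    }

    /// Drop the entry for a deleted sandbox; returns true if one existed.
    pub fn remove(&mut self, sandbox_id: &str) -> bool {
        self.channels.remove(sandbox_id).is_some()
    }

    /// Number of sandboxes that currently have an entry.
    pub fn len(&self) -> usize {
        self.channels.len()
    }
}
```

Calling `remove` from every deletion path keeps the map from growing with the number of sandboxes ever created, and validating the ID before subscribing avoids creating entries for sandboxes that never existed.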
…140)
Closes #26
All list RPCs (ListSandboxes, ListProviders, ListSandboxPolicies, ListInferenceRoutes) passed the client-provided limit directly to SQL queries with no upper bound. A client could send limit=u32::MAX and cause the server to load all records into memory, risking OOM. This introduces a MAX_PAGE_SIZE constant (1000) and a clamp_limit helper that caps the limit in every list handler before it reaches the persistence layer.
Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
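A minimal sketch of the clamp described in this commit. The names MAX_PAGE_SIZE and clamp_limit and the value 1000 come from the commit message; the exact signature is an assumption, and the real helper may treat a limit of 0 differently.

```rust
/// Upper bound on page size, as named in the commit message.
pub const MAX_PAGE_SIZE: u32 = 1000;

/// Cap a client-provided limit before it reaches the persistence layer.
/// (Assumed signature; how the real helper handles 0 is not stated.)
pub fn clamp_limit(requested: u32) -> u32 {
    requested.min(MAX_PAGE_SIZE)
}
```

With this in place, a request carrying `limit = u32::MAX` reaches the SQL layer as 1000, while smaller limits pass through unchanged.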
…ter, and sandbox images (#144)
…tion (#145)
* fix(server): add field-level size limits to sandbox and provider creation
  Closes #24
  Add validate_sandbox_spec and provider field validation with named constants. Configure explicit 1MB tonic max_decoding_message_size. Inference routes excluded per #133 rearchitecture.
* chore: remove issue number references from code comments
---------
Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
…move implicit catch-all (#146)
Remove multi-route CRUD system and replace with single managed cluster route (inference.local).
Key changes:
- Remove inference route CRUD RPCs and CLI commands
- Remove InspectForInference OPA action; policy is binary allow/deny
- Introduce AuthHeader enum and InferenceProviderProfile in navigator-core
- Router is now provider-agnostic: auth style carried on ResolvedRoute
- Replace InferenceRouteSpec with ClusterInferenceConfig (2 fields vs 8)
- Rename proto: routing_hint->name, SandboxResolvedRoute->ResolvedRoute, GetSandboxInferenceBundle->GetInferenceBundle, drop sandbox_id param
- Rename RouteConfig.route -> RouteConfig.name; use inference.local
- Add 'nemoclaw cluster inference update' for partial config changes
- Delete stale navigator.inference.v1.rs checked-in proto file
- Update architecture docs, agent skills, and CLI reference
Closes #133
FYI @kirit93, I just merged the local inference PR: NVIDIA/NemoClaw#146 and this will impact docs around it (pretty much a full rewrite). I can take a pass and push to this branch, would that work?
Pass the computed cargo version through Docker and cluster packaging paths so deployed clusters report the built artifact version while keeping latest and explicit version tags aligned.
The reusable Docker build workflow runs under sh by default, and the compute-version step only uses a single command substitution. Removing pipefail avoids the illegal option error without changing the explicit tag-fetch setup.
#158)
* feat(proxy): support plain HTTP forward proxy for private IP endpoints
  Add forward proxy mode to the sandbox proxy so that standard HTTP libraries (httpx, requests, etc.) work with HTTP_PROXY for plain HTTP calls to private IP endpoints. Previously, non-CONNECT methods were unconditionally rejected with 403.
  The forward proxy path requires all three conditions to be met:
  - OPA policy explicitly allows the destination
  - The matched endpoint has allowed_ips configured
  - All resolved IPs are RFC 1918 private
  This ensures plain HTTP never reaches the public internet while enabling seamless access to internal services without custom CONNECT tunnel code.
  Implementation:
  - parse_proxy_uri(): parses absolute-form URIs into components
  - rewrite_forward_request(): rewrites to origin-form, strips hop-by-hop headers, adds Via and Connection: close
  - handle_forward_proxy(): full handler with OPA eval, SSRF checks, private-IP gate, upstream connect, and bidirectional relay
  - Updated dispatch in handle_tcp_connection to route non-CONNECT methods
  Includes 14 unit tests and 6 E2E tests (FWD-1 through FWD-6). CONNECT path remains completely untouched.
  Closes #155
* fix(proxy): remove InspectForInference match arm removed by #146
  The inference routing simplification in #146 reduced NetworkAction to Allow/Deny, removing InspectForInference. Drop the dead match arm from handle_forward_proxy.
* fix(sandbox): restore BestEffort as default Landlock compatibility
  The Landlock V2 upgrade in #151 changed the default from BestEffort to HardRequirement. This causes all proxy-mode sandboxes to crash with Permission denied when the policy omits the landlock field, because the child process gets locked to only /etc/navigator-tls and /sandbox. Restore BestEffort as the default so policies without an explicit landlock field degrade gracefully.
  Fixes #161
* fix(sandbox): inject baseline filesystem paths for proxy-mode sandboxes
  Proxy-mode sandboxes need baseline filesystem paths (/usr, /lib, /etc, /app, /var/log read-only; /sandbox, /tmp read-write) for the child process to function under Landlock. Without these, the child can't exec binaries, resolve DNS, or load shared libraries.
  The supervisor now enriches the policy with these baseline paths at startup, covering both standalone (file) and gateway (gRPC) modes. For gateway mode, the enriched policy is synced back so users see the effective policy via 'nemoclaw sandbox get'.
  The gateway validation is relaxed to allow additive filesystem changes (new paths can be added, existing paths cannot be removed) to support the supervisor's enrichment sync-back.
  Includes 2 E2E tests: BFS-1 (missing filesystem_policy) and BFS-2 (incomplete filesystem_policy).
  Fixes #161
* fix(e2e): update assertion for relaxed filesystem validation message
---------
Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
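The three-condition gate from the feat(proxy) commit above can be sketched as follows. This is an illustration under assumptions, not the handler from the PR: `ForwardDecision` and `forward_proxy_gate` are hypothetical names, the OPA result and the `allowed_ips` check are reduced to booleans, IPv6 addresses are denied because RFC 1918 only defines IPv4 ranges, and an empty resolution result is treated as a denial.

```rust
use std::net::IpAddr;

/// Result of the forward-proxy gate (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum ForwardDecision {
    Allow,
    Deny(&'static str),
}

/// All three conditions must hold for a plain HTTP request to be forwarded.
pub fn forward_proxy_gate(
    policy_allows: bool,
    has_allowed_ips: bool,
    resolved: &[IpAddr],
) -> ForwardDecision {
    // 1. The OPA policy must explicitly allow the destination.
    if !policy_allows {
        return ForwardDecision::Deny("policy does not allow destination");
    }
    // 2. The matched endpoint must have allowed_ips configured.
    if !has_allowed_ips {
        return ForwardDecision::Deny("endpoint has no allowed_ips");
    }
    // 3. Every resolved address must be RFC 1918 private
    //    (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
    //    Assumption: no addresses at all counts as a failure.
    let all_private = !resolved.is_empty()
        && resolved.iter().all(|ip| match ip {
            IpAddr::V4(v4) => v4.is_private(),
            IpAddr::V6(_) => false,
        });
    if !all_private {
        return ForwardDecision::Deny("resolved IP is not RFC 1918 private");
    }
    ForwardDecision::Allow
}
```

Requiring every resolved address to be private, rather than any one of them, is what keeps a hostname with mixed DNS records from sending plain HTTP to a public address.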
…, merged network-access-rules into policies) Made-with: Cursor
Added documentation for NemoClaw.